We broke AI guardrails down into six categories. For each, we curated datasets and models that demonstrate the state of AI safety, using LLMs and other open-source models.
| Developer | Model | Latency | Outcome |
|---|---|---|---|
| Guardrails AI | Toxic Language | | |
| | Natural Language Content Safety | | |
| Microsoft | Azure Content Safety | | |
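One way to fill in the latency and outcome columns is to run each guardrail over the same prompts and time every check. The sketch below does this for the Guardrails AI Toxic Language validator; it assumes the `guardrails-ai` package is installed along with the hub validator (`guardrails hub install hub://guardrails/toxic_language`) and that the `Guard` / `ToxicLanguage` API shown matches the installed version.

```python
import time

from guardrails import Guard
from guardrails.hub import ToxicLanguage  # assumes the hub validator is installed

# Configure the guard to raise on toxic input so pass/block is easy to record.
guard = Guard().use(
    ToxicLanguage(threshold=0.5, validation_method="sentence", on_fail="exception")
)

# Sample prompts; replace with your own evaluation set.
prompts = [
    "Thanks for the detailed answer, this really helped.",
    "You are a worthless idiot and everyone knows it.",
]

for text in prompts:
    start = time.perf_counter()
    try:
        guard.validate(text)
        outcome = "pass"
    except Exception:
        outcome = "blocked"
    latency_ms = (time.perf_counter() - start) * 1000
    print(f"{latency_ms:7.1f} ms | {outcome:7} | {text[:40]}")
```

The same loop can wrap any other guardrail (for example, a call to the Azure Content Safety API) so that latency and outcome are measured under identical conditions.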
| Label | Samples |
|---|---|
| toxic | 6090 |
| obscene | 3691 |
| insult | 3427 |
| identity_hate | 712 |
| severe_toxic | 367 |
| threat | 211 |
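These label names match the binary label columns of the Jigsaw Toxic Comment Classification dataset, where each comment can carry several labels at once. A per-label sample count like the table above can be reproduced with pandas; the sketch below assumes a local `train.csv` in that layout (the filename is a placeholder).

```python
import pandas as pd

# Assumed input: a CSV with one binary (0/1) column per toxicity label,
# following the Jigsaw Toxic Comment Classification layout.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

df = pd.read_csv("train.csv")  # placeholder path; point this at your copy of the data

# Sum each binary column to get the number of positive samples per label,
# then sort so the most frequent labels come first, as in the table above.
counts = df[LABELS].sum().sort_values(ascending=False)
print(counts.to_string())
```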